
    The Dynamical Functional Particle Method for Multi-Term Linear Matrix Equations

    Recent years have seen a renewal of interest in multi-term linear matrix equations, as these have come to play a role in a number of important applications. Here, we consider the solution of such equations by means of the dynamical functional particle method, an iterative technique that relies on the numerical integration of a damped second-order dynamical system. We develop a new algorithm for the solution of a large class of these equations, a class that includes, among others, all linear matrix equations with Hermitian positive definite or negative definite coefficients. In numerical experiments, our MATLAB implementation outperforms existing methods for the solution of multi-term Sylvester equations. For the Sylvester equation AX + XB = C, in particular, it can be faster and more accurate than the built-in implementation of the Bartels–Stewart algorithm when A and B are well conditioned and have very different sizes.
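    For the two-term Sylvester case AX + XB = C with Hermitian positive definite A and B, the idea behind such a method can be sketched as follows; this is a minimal NumPy illustration under those assumptions, with purely illustrative step size and damping parameters, not the paper's algorithm or parameter choices:

        import numpy as np

        def dfpm_sylvester(A, B, C, dt=0.1, eta=1.0, tol=1e-10, max_iter=10000):
            """Solve AX + XB = C (A, B Hermitian positive definite) by integrating
            the damped second-order system X'' + eta*X' = C - A@X - X@B with a
            semi-implicit Euler scheme. dt and eta are illustrative; in practice
            they must be tuned to the spectrum of the underlying linear operator."""
            X = np.zeros_like(C)
            V = np.zeros_like(C)            # "velocity" of the particle system
            normC = np.linalg.norm(C)
            for _ in range(max_iter):
                R = C - A @ X - X @ B       # residual drives the dynamics
                V += dt * (R - eta * V)     # update velocity first (symplectic flavor)
                X += dt * V                 # then position
                if np.linalg.norm(R) <= tol * normC:
                    break
            return X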

    A Backward/Forward Recovery Approach for the Preconditioned Conjugate Gradient Algorithm

    Several recent papers have introduced a periodic verification mechanism to detect silent errors in iterative solvers. Chen [PPoPP'13, pp. 167--176] has shown how to combine such a verification mechanism (a stability test checking the orthogonality of two vectors and recomputing the residual) with checkpointing: the idea is to verify every d iterations, and to checkpoint every c × d iterations. When a silent error is detected by the verification mechanism, one can roll back to and re-execute from the last checkpoint. In this paper, we also propose to combine checkpointing and verification, but we use algorithm-based fault tolerance (ABFT) rather than stability tests. ABFT can be used for error detection, but also for error detection and correction, allowing a forward recovery (with no rollback or re-execution) when a single error is detected. We introduce an abstract performance model to compute the performance of all schemes, and we instantiate it using the preconditioned conjugate gradient algorithm. Finally, we validate our new approach through a set of simulations.
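    The core of ABFT detection is a cheap checksum invariant maintained alongside the computation. Below is a minimal sketch of a checksum-protected matrix-vector product, the building block such a scheme would guard inside PCG; the function name and tolerance are illustrative, not the paper's notation:

        import numpy as np

        def checked_matvec(A, x, w, rtol=1e-8):
            """Matrix-vector product protected by an ABFT checksum.
            w = A.T @ ones(n) is precomputed once; the invariant
            sum(A @ x) == w @ x holds in exact arithmetic, so a large
            discrepancy signals a silent error in the product."""
            y = A @ x
            # scale the test by a bound on rounding error to avoid false alarms
            scale = (np.abs(A) @ np.abs(x)).sum()
            if abs(y.sum() - w @ x) > rtol * scale:
                raise RuntimeError("silent error detected in matrix-vector product")
            return y

        # setup: w is the column-sum checksum of A, computed once
        # w = A.T @ np.ones(A.shape[0]); y = checked_matvec(A, x, w)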

    Coping with silent errors in HPC applications

    This report describes a unified framework for the detection and correction of silent errors, which constitute a major threat for scientific applications at extreme scale. We first motivate the problem and explain why checkpointing must be combined with some verification mechanism. Then we introduce a general-purpose technique based upon computational patterns that periodically repeat over time. These patterns interleave verifications and checkpoints, and we show how to determine the pattern minimizing the expected execution time. Then we move to application-specific techniques and review dynamic programming algorithms for linear chains of tasks, as well as ABFT-oriented algorithms for iterative methods in sparse linear algebra.
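    To see the flavor of the pattern analysis, consider the simplest pattern: W units of work followed by one verification (cost V) and one checkpoint (cost C), with silent errors striking at rate \lambda. The following first-order sketch is our simplification, not the report's full derivation. An error occurring within the pattern (probability roughly \lambda W) is caught at the verification and forces a re-execution that wastes about W/2 units of work on average, so

        \[
          \mathbb{E}[T(W)] \;\approx\; W + V + C + \lambda W \,(W/2 + V),
          \qquad
          H(W) \;=\; \frac{\mathbb{E}[T(W)]}{W} - 1 \;\approx\; \frac{V + C}{W} + \frac{\lambda W}{2}.
        \]

    Minimizing the overhead H gives a Young/Daly-like optimum:

        \[
          W^{*} = \sqrt{\frac{2(V + C)}{\lambda}},
          \qquad
          H(W^{*}) = \sqrt{2\lambda (V + C)}.
        \]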

    Optimality of the Paterson-Stockmeyer Method for Evaluating Matrix Polynomials and Rational Matrix Functions

    Many state-of-the-art algorithms reduce the computation of transcendental matrix functions to the evaluation of polynomial or rational approximants at a matrix argument. This task can be accomplished efficiently by resorting to the Paterson-Stockmeyer method, an evaluation scheme originally developed for matrix polynomials that extends quite naturally to rational functions. An important feature of these techniques is that the number of matrix multiplications required to evaluate an approximant of order n grows more slowly than n itself, with the result that different approximants yield the same asymptotic computational cost. We analyze the number of matrix multiplications required by the Paterson-Stockmeyer method and by two widely used generalizations, one for evaluating diagonal Padé approximants of generic functions and one specifically tailored to those of the exponential. In all three cases, we identify the approximants of maximum order for any given computational cost.
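    For reference, the basic Paterson-Stockmeyer scheme evaluates a degree-n matrix polynomial with roughly 2*sqrt(n) matrix multiplications by splitting it into blocks of s ≈ sqrt(n) terms and applying Horner's rule in A^s. A minimal sketch of the textbook scheme (ours, not the paper's code):

        import numpy as np
        from math import ceil, sqrt

        def paterson_stockmeyer(coeffs, A):
            """Evaluate p(A) = sum_k coeffs[k] * A**k.
            Cost: (s - 1) matmuls for the powers A^2..A^s plus n//s matmuls
            for the Horner recurrence in A^s, about 2*sqrt(n) in total."""
            n = len(coeffs) - 1
            s = max(1, int(ceil(sqrt(n)))) if n > 0 else 1
            m = A.shape[0]
            pows = [np.eye(m), A]            # A^0, A^1, ..., A^s
            for _ in range(s - 1):
                pows.append(pows[-1] @ A)
            r = n // s

            def block(i):
                # B_i = sum_{j<s} coeffs[i*s + j] * A^j (partial block for i = r)
                B = np.zeros_like(A, dtype=float)
                for j in range(s):
                    k = i * s + j
                    if k <= n:
                        B += coeffs[k] * pows[j]
                return B

            P = block(r)
            for i in range(r - 1, -1, -1):   # Horner in A^s on the blocks
                P = P @ pows[s] + block(i)
            return P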

    Weighted geometric mean of large-scale matrices: numerical analysis and algorithms

    Computing the weighted geometric mean of large sparse matrices is an operation that rapidly becomes intractable as the size of the matrices involved grows. However, if we are not interested in computing the matrix function itself, but just in its product with a vector, the problem becomes simpler, and there is a chance to solve it even when the matrix mean itself would be impossible to compute. Our interest is motivated by the fact that this calculation has practical applications related to the preconditioning of some operators arising in the domain decomposition of elliptic problems. In this thesis, we explore how such a computation can be performed efficiently. First, we exploit the properties of the weighted geometric mean and find several equivalent ways to express it through real powers of a matrix. Hence, we focus our attention on matrix powers and examine how well-known techniques can be adapted to the solution of the problem at hand. In particular, we consider two broad families of approaches for the computation of f(A) v, namely quadrature formulae and Krylov subspace methods, and generalize them to the pencil case f(A\B) v. Finally, we provide an extensive experimental evaluation of the proposed algorithms and try to assess how convergence speed and execution time are influenced by certain characteristics of the input matrices. Our results suggest that a few elements have some bearing on the performance and that, although there is no best choice in general, knowing the conditioning and the sparsity of the arguments beforehand can considerably help in choosing the best strategy to tackle the problem.
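    As an illustration of the Krylov branch, here is a minimal Arnoldi sketch for approximating f(A\B) v with f(X) = X^t, applying the operator A^{-1}B through LU solves with A and evaluating f on the small Hessenberg matrix. This is our simplified illustration, not one of the thesis's tuned algorithms; for SPD A and B it connects to the weighted geometric mean through the identity (A #_t B) v = A (A^{-1}B)^t v:

        import numpy as np
        from scipy.linalg import lu_factor, lu_solve, fractional_matrix_power

        def arnoldi_pencil_power(A, B, v, t, m=30):
            """Approximate (A^{-1} B)^t v with m Arnoldi steps on A^{-1}B,
            applied through LU solves with A (never forming A^{-1}B)."""
            lu = lu_factor(A)
            n = v.size
            V = np.zeros((n, m + 1))
            H = np.zeros((m + 1, m))
            beta = np.linalg.norm(v)
            V[:, 0] = v / beta
            for j in range(m):
                w = lu_solve(lu, B @ V[:, j])     # one application of A^{-1}B
                for i in range(j + 1):            # modified Gram-Schmidt
                    H[i, j] = V[:, i] @ w
                    w -= H[i, j] * V[:, i]
                H[j + 1, j] = np.linalg.norm(w)
                if H[j + 1, j] < 1e-12:           # lucky breakdown
                    m = j + 1
                    break
                V[:, j + 1] = w / H[j + 1, j]
            e1 = np.zeros(m)
            e1[0] = beta
            y = V[:, :m] @ (fractional_matrix_power(H[:m, :m], t) @ e1)
            # spectrum is real and positive for SPD pencils; drop rounding-level
            # imaginary parts if the matrix power returns a complex array
            return y.real if np.iscomplexobj(y) else y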

    Sampling the Eigenvalues of Random Orthogonal and Unitary Matrices

    We develop an efficient algorithm for sampling the eigenvalues of random matrices distributed according to the Haar measure over the orthogonal or unitary group. Our technique directly samples a factorization of the Hessenberg form of such matrices, and then computes their eigenvalues with a tailored core-chasing algorithm. This approach requires a number of floating-point operations that is quadratic in the order of the matrix being sampled, and can be adapted to other matrix groups. In particular, we explain how it can be used to sample the Haar measure over the special orthogonal and unitary groups, as well as the conditional probability distribution obtained by requiring that the determinant of the sampled matrix be a given point on the unit circle.
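    For comparison, the standard cubic-cost baseline draws a Haar unitary via the QR factorization of a complex Ginibre matrix with a phase correction (Mezzadri's recipe) and then calls a general eigensolver; the paper's contribution is to replace this with a quadratic-cost sampler. A sketch of that baseline, not of the paper's algorithm:

        import numpy as np

        def haar_unitary_eigvals(n, rng=None):
            """Eigenvalues of a Haar-distributed n x n unitary matrix,
            via QR of a complex Ginibre matrix (O(n^3) baseline)."""
            rng = np.random.default_rng() if rng is None else rng
            Z = (rng.standard_normal((n, n))
                 + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
            Q, R = np.linalg.qr(Z)
            d = np.diagonal(R)
            Q = Q * (d / np.abs(d))      # phase fix that makes Q exactly Haar
            return np.linalg.eigvals(Q)  # all eigenvalues lie on the unit circle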

    CPFloat: A C library for simulating low-precision arithmetic

    One can simulate low-precision floating-point arithmetic in software by executing each arithmetic operation in hardware and then rounding the result to the desired number of significant bits. For IEEE-compliant formats, rounding requires only standard mathematical library functions, but handling subnormals, underflow, and overflow demands special attention, and numerical errors can cause mathematically correct formulae to behave incorrectly in finite arithmetic. Moreover, the ensuing implementations are not necessarily efficient, as the library functions these techniques build upon are typically designed to handle a broad range of cases and may not be optimized for the specific needs of rounding algorithms. CPFloat is a C library for simulating low-precision arithmetic. It offers efficient routines for rounding, performing mathematical computations, and querying properties of the simulated low-precision format. The software exploits the bit-level floating-point representation of the format in which the numbers are stored, and replaces costly library calls with low-level bit manipulations and integer arithmetic. In numerical experiments, the new techniques bring a considerable speedup (typically one order of magnitude or more) over existing alternatives in C, C++, and MATLAB. To our knowledge, CPFloat is currently the most efficient and complete library for experimenting with custom low-precision floating-point arithmetic available in any language.
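    The library-function approach that CPFloat improves upon can be sketched in a few lines: round each binary64 result to t significand bits using frexp/ldexp. This is a minimal illustration of the simulation idea only; it ignores the exponent range, subnormals, and overflow that a real simulator must handle, and it is exactly the kind of generic-library code that CPFloat replaces with bit manipulation and integer arithmetic:

        import math

        def round_significand(x, t):
            """Round binary64 x to t significand bits, round-to-nearest-even.
            Sketch only: no exponent-range clamping, no subnormal handling."""
            if x == 0.0 or not math.isfinite(x):
                return x
            m, e = math.frexp(x)                 # x = m * 2**e, 0.5 <= |m| < 1
            return math.ldexp(round(math.ldexp(m, t)), e - t)

        # e.g. t = 11 mimics binary16 precision (but not its exponent range):
        # round_significand(0.1, 11) -> 0.0999755859375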